385 research outputs found

    High-Throughput 3D Homology Detection via NMR Resonance Assignment

    One goal of the structural genomics initiative is the identification of new protein folds. Sequence-based structural homology prediction methods are an important means for prioritizing unknown proteins for structure determination. However, an important challenge remains: two highly dissimilar sequences can have similar folds. How can we detect this rapidly, in the context of structural genomics? High-throughput NMR experiments, coupled with novel algorithms for data analysis, can address this challenge. We report an automated procedure, called HD, for detecting 3D structural homologies from sparse, unassigned protein NMR data. Our method identifies 3D models in a protein structural database whose geometries best fit the unassigned experimental NMR data. HD does not use, and is thus not limited by, sequence homology. The method can also be used to confirm or refute structural predictions made by other techniques such as protein threading or homology modelling. The algorithm runs in $O(pn^{5/2} \log(cn) + p \log p)$ time, where $p$ is the number of proteins in the database, $n$ is the number of residues in the target protein, and $c$ is the maximum edge weight in an integer-weighted bipartite graph. Our experiments on real NMR data from 3 different proteins against a database of 4,500 representative folds demonstrate that the method identifies closely related protein folds, including sub-domains of larger proteins, with as little as 10-30% sequence homology between the target protein (or sub-domain) and the computed model. In particular, we report no false negatives or false positives despite significant percentages of missing experimental data.

    An Improved Nuclear Vector Replacement Algorithm for Nuclear Magnetic Resonance Assignment

    We report an improvement to the Nuclear Vector Replacement (NVR) algorithm for high-throughput Nuclear Magnetic Resonance (NMR) resonance assignment. The new algorithm improves upon our earlier result in terms of accuracy and computational complexity. In particular, the new NVR algorithm assigns backbone resonances without error (100% accuracy) on the same test suite examined in [Langmead and Donald, J. Biomol. NMR 2004], and runs in $O(n^{5/2} \log(cn))$ time, where $n$ is the number of amino acids in the primary sequence of the protein and $c$ is the maximum edge weight in an integer-weighted bipartite graph.
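
    The abstract above frames resonance assignment as matching on an integer-weighted bipartite graph. The following is a minimal sketch of that idea only, not the NVR algorithm itself: it finds a maximum-weight perfect matching by brute force, which is exponential and suitable only for tiny inputs, whereas the abstract reports an $O(n^{5/2} \log(cn))$ bound. The function name `best_assignment` and the weight matrix are hypothetical illustrations.

```python
from itertools import permutations


def best_assignment(weights):
    """Return (total, mapping) for a maximum-weight perfect matching.

    weights[i][j] is a hypothetical integer score for assigning
    resonance peak i to residue j. This is exhaustive search over all
    permutations, used here only to illustrate the matching problem;
    it is NOT the algorithm described in the abstract.
    """
    n = len(weights)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(weights[i][perm[i]] for i in range(n))
        if best_total is None or total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Made-up scores: rows are peaks, columns are residues.
weights = [
    [9, 2, 1],
    [3, 8, 2],
    [1, 4, 7],
]
total, mapping = best_assignment(weights)
print(total, mapping)  # mapping[i] is the residue assigned to peak i
```

    A practical implementation would replace the exhaustive search with a polynomial-time matching algorithm; the brute-force version is kept here because it is easy to verify by hand.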

    3D-Structural Homology Detection via Unassigned Residual Dipolar Couplings

    Recognition of a protein's fold provides valuable information about its function. While many sequence-based homology prediction methods exist, an important challenge remains: two highly dissimilar sequences can have similar folds. How can we detect this rapidly, in the context of structural genomics? High-throughput NMR experiments, coupled with novel algorithms for data analysis, can address this challenge. We report an automated procedure for detecting 3D-structural homologies from sparse, unassigned protein NMR data. Our method identifies the 3D-structural models in a protein structural database whose geometries best fit the unassigned experimental NMR data. It does not use sequence information and is thus not limited by sequence homology. The method can also be used to confirm or refute structural predictions made by other techniques such as protein threading or sequence homology. The algorithm runs in $O(pnk^3)$ time, where $p$ is the number of proteins in the database, $n$ is the number of residues in the target protein, and $k$ is the resolution of a rotation search. The method requires only uniform 15N-labelling of the protein and processes unassigned 1H-15N residual dipolar couplings, which can be acquired in a couple of hours. Our experiments on NMR data from 5 different proteins demonstrate that the method identifies closely related protein folds, despite low sequence homology between the target protein and the computed model.
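
    To make "geometries that best fit unassigned residual dipolar couplings" concrete, here is a minimal sketch under stated assumptions. It uses the standard RDC relation $D = D_{max} \, v^T S v$, where $S$ is a symmetric, traceless Saupe matrix and $v$ is a unit bond vector. The scoring function, which compares sorted value lists so that no peak-to-residue assignment is needed, is a simplification chosen for illustration and is not the scoring used in the paper. The names `back_calculate_rdc` and `unassigned_fit` and all numbers are hypothetical.

```python
import math


def back_calculate_rdc(saupe, vector, d_max=1.0):
    """Predicted RDC for one bond vector: d_max * v^T S v, with v normalised."""
    norm = math.sqrt(sum(c * c for c in vector))
    v = [c / norm for c in vector]
    return d_max * sum(
        v[i] * saupe[i][j] * v[j] for i in range(3) for j in range(3)
    )


def unassigned_fit(observed, predicted):
    """RMS difference between the sorted value lists.

    Sorting removes the need to know which peak belongs to which
    residue. Assumes equal-length lists (no missing data), which is
    an illustrative simplification.
    """
    if len(observed) != len(predicted):
        raise ValueError("this sketch assumes equal-length lists")
    pairs = zip(sorted(observed), sorted(predicted))
    return math.sqrt(sum((o - p) ** 2 for o, p in pairs) / len(observed))


# Hypothetical diagonal Saupe matrix (symmetric, trace 0) and bond vectors.
saupe = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, -3.0]]
vectors = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
predicted = [back_calculate_rdc(saupe, v) for v in vectors]

# The same values in a different (unassigned) order fit perfectly.
observed = [-3.0, 1.0, 2.0]
score = unassigned_fit(observed, predicted)
print(predicted, score)
```

    In a database search along the lines the abstract describes, a score of this kind would be evaluated for each candidate model over a grid of rotations, and the models ranked by their best score.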

    Error Detection and Recovery for Robot Motion Planning with Uncertainty

    Robots must plan and execute tasks in the presence of uncertainty. Uncertainty arises from sensing errors, control errors, and uncertainty in the geometry of the environment. The last, which is called model error, has received little previous attention. We present a framework for computing motion strategies that are guaranteed to succeed in the presence of all three kinds of uncertainty. The motion strategies comprise sensor-based gross motions, compliant motions, and simple pushing motions

    High-Throughput Inference of Protein-Protein Interaction Sites from Unassigned NMR Data by Analyzing Arrangements Induced By Quadratic Forms on 3-Manifolds

    We cast the problem of identifying protein-protein interfaces, using only unassigned NMR spectra, into a geometric clustering problem. Identifying protein-protein interfaces is critical to understanding inter- and intra-cellular communication, and NMR allows the study of protein interaction in solution. However, NMR studies of a protein complex are often very time-consuming, mainly due to the bottleneck in assigning the chemical shifts, even if the apo structures of the constituent proteins are known. We study whether it is possible, in a high-throughput manner, to identify the interface region of a protein complex using only unassigned chemical shift and residual dipolar coupling (RDC) data. We introduce a geometric optimization problem where we must cluster the cells in an arrangement on the boundary of a 3-manifold. The arrangement is induced by a spherical quadratic form, which in turn is parameterized by SO(3) x R^2. We show that this formalism derives directly from the physics of RDCs. We present an optimal algorithm for this problem that runs in O(n^3 log n) time for an n-residue protein. We then use this clustering algorithm as a subroutine in a practical algorithm for identifying the interface region of a protein complex from unassigned NMR data. We present the results of our algorithm on NMR data for 7 proteins from 5 protein complexes and show that our approach is useful for high-throughput applications in which we seek to rapidly identify the interface region of a protein complex.

    A Probability-Based Similarity Measure for Saupe Alignment Tensors with Applications to Residual Dipolar Couplings in NMR Structural Biology

    High-throughput NMR structural biology and NMR structural genomics pose a fascinating set of geometric challenges. A key bottleneck in NMR structural biology is the resonance assignment problem. We seek to accelerate protein NMR resonance assignment and structure determination by exploiting a priori structural information. In particular, Nuclear Vector Replacement (NVR) has been proposed as a method for solving the assignment problem given a priori structural information [24,25]. Among several different kinds of input data, NVR uses a particular type of NMR data known as residual dipolar couplings (RDCs). The basic physics of residual dipolar couplings tells us that the data should be explainable by a structural model and a set of parameters contained within the Saupe alignment tensor. In the NVR algorithm, one estimates the Saupe alignment tensors and then proceeds to refine those estimates. We would like to quantify the accuracy of such estimates by comparing the estimated Saupe matrix to the correct Saupe matrix. In this work, we propose a way to quantify this comparison. Given a correct Saupe matrix and an estimated Saupe matrix, we compute an upper bound on the probability that a randomly rotated Saupe tensor would have an error smaller than the estimated Saupe matrix. This has the advantage of being a quantified upper bound which also has a clear interpretation in terms of geometry and probability. While the specific application of our rotation probability results is given to NVR, our novel methods can be used for any RDC-based algorithm to bound the accuracy of the estimated alignment tensors. Furthermore, they could also be used in X-ray crystallography or molecular docking to quantitate the accuracy of calculated rotations of proteins, protein domains, nucleic acids, or small molecules.

    SAR by MS for Functional Genomics (Structure-Activity Relation by Mass Spectrometry)

    Large-scale functional genomics will require fast, high-throughput experimental techniques, coupled with sophisticated computer algorithms for data analysis and experiment planning. In this paper, we introduce a combined experimental-computational protocol called Structure-Activity Relation by Mass Spectrometry (SAR by MS), which can be used to elucidate the function of protein-DNA or protein-protein complexes. We present algorithms for SAR by MS and analyze their complexity. Carefully-designed Matrix-Assisted Laser Desorption/Ionization Time-Of-Flight (MALDI TOF) and Electrospray Ionization (ESI) assays require only femtomolar samples, take only microseconds per spectrum to record, enjoy a resolution of up to one dalton in $10^6$, and (in the case of MALDI) can operate on protein complexes up to a megadalton in mass. Hence, the technique is attractive for high-throughput functional genomics. In SAR by MS, first, selected residues or nucleosides are 2H-, 13C-, and/or 15N-labeled. Second, the complex is crosslinked. Third, the complex is cleaved with proteases and/or endonucleases. Depending on the binding mode, some cleavage sites will be shielded by the crosslinking. Finally, a mass spectrum of the resulting fragments is obtained and analyzed. The last step is the Data Analysis phase, in which the mass signatures are interpreted to obtain constraints on the functional binding mode. Experiment Planning entails deciding what labeling strategy and cleaving agents to employ, so as to minimize mass degeneracy and spectral overlap, in order that the constraints derived in data analysis yield a small number of binding hypotheses. A number of combinatorial and algorithmic questions arise in deriving algorithms for both Experiment Planning and Data Analysis. We explore the complexity of these problems, obtaining upper and lower bounds. Experimental results are reported from an implementation of our algorithms.
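
    The notion of mass degeneracy mentioned above can be illustrated with a small in-silico digest. This is a sketch under assumptions, not the paper's algorithm: it assumes a trypsin-like cleavage rule (cut after K or R unless the next residue is P), uses nominal integer residue masses, and ignores labeling and crosslinking entirely. The function names and the example sequence are hypothetical.

```python
from collections import defaultdict

# Nominal (integer) residue masses in daltons; one water is added per fragment.
NOMINAL = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}
WATER = 18


def cleave(sequence):
    """Split after K or R unless the next residue is P (assumed trypsin-like rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


def fragment_mass(fragment):
    """Nominal mass of an unlabeled peptide fragment."""
    return sum(NOMINAL[r] for r in fragment) + WATER


def degenerate_masses(fragments):
    """Map each mass shared by two or more fragments to those fragments."""
    by_mass = defaultdict(list)
    for fragment in fragments:
        by_mass[fragment_mass(fragment)].append(fragment)
    return {m: fs for m, fs in by_mass.items() if len(fs) > 1}


fragments = cleave("GAKAGKLR")  # hypothetical sequence
degenerate = degenerate_masses(fragments)
print(fragments, degenerate)
```

    Here two fragments have the same composition and therefore the same mass, so a mass spectrum alone cannot tell them apart. Choosing labels and cleaving agents to avoid such collisions is the kind of question the abstract assigns to Experiment Planning.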

    Importance of CSF-based Aβ clearance with age in humans increases with declining efficacy of blood-brain barrier/proteolytic pathways

    The kinetics of amyloid beta turnover within human brain is still poorly understood. We previously found a dramatic decline in the turnover of Aβ peptides in normal aging. It was not known if brain interstitial fluid/cerebrospinal fluid (ISF/CSF) fluid exchange, CSF turnover, blood-brain barrier function or proteolysis were affected by aging or the presence of β amyloid plaques. Here, we describe a non-steady state physiological model developed to decouple CSF fluid transport from other processes. Kinetic parameters were estimated using: (1) MRI-derived brain volumes, (2) stable isotope labeling kinetics (SILK) of amyloid-β peptide (Aβ), and (3) lumbar CSF Aβ concentration during SILK. Here we show that changes in blood-brain barrier transport and/or proteolysis were largely responsible for the age-related decline in Aβ turnover rates. CSF-based clearance declined modestly in normal aging but became increasingly important due to the slowing of other processes. The magnitude of CSF-based clearance was also lower than that due to blood-brain barrier function plus proteolysis. These results suggest important roles for blood-brain barrier transport and proteolytic degradation of Aβ in the development of Alzheimer's disease in humans.

    The NOESY Jigsaw: Automated Protein Secondary Structure and Main-Chain Assignment from Sparse, Unassigned NMR Data

    High-throughput, data-directed computational protocols for Structural Genomics (or Proteomics) are required in order to evaluate the protein products of genes for structure and function at rates comparable to current gene-sequencing technology. This paper presents the Jigsaw algorithm, a novel high-throughput, automated approach to protein structure characterization with nuclear magnetic resonance (NMR). Jigsaw consists of two main components: (1) graph-based secondary structure pattern identification in unassigned heteronuclear NMR data, and (2) assignment of spectral peaks by probabilistic alignment of identified secondary structure elements against the primary sequence. Jigsaw's deferment of assignment until after secondary structure identification differs greatly from traditional approaches, which begin by correlating peaks among dozens of experiments. By deferring assignment, Jigsaw not only eliminates this bottleneck, it also allows the number of experiments to be reduced from dozens to four, none of which requires 13C-labeled protein. This in turn dramatically reduces the amount and expense of wet lab molecular biology for protein expression and purification, as well as the total spectrometer time to collect data. Our results for three test proteins demonstrate that we are able to identify and align approximately 80 percent of alpha-helical and 60 percent of beta-sheet structure. Jigsaw is extremely fast, running in minutes on a Pentium-class Linux workstation. This approach yields quick and reasonably accurate (as opposed to the traditional slow and extremely accurate) structure calculations, utilizing a suite of graph analysis algorithms to compensate for the data sparseness. Jigsaw could be used for quick structural assays to speed data to the biologist early in the process of investigation, and could in principle be applied in an automation-like fashion to a large fraction of the proteome.

    Economic Analysis of Labor Markets and Labor Law: An Institutional/Industrial Relations Perspective
